Support async ChannelMonitorUpdate persist for claims against closed channels #3414

Open

TheBlueMatt wants to merge 9 commits into main from 2024-09-async-persist-claiming-from-closed-chan

Conversation

@TheBlueMatt (Collaborator) commented Nov 18, 2024

Based on #3413, this finally implements async ChannelMonitorUpdate persistence against closed channels. After all the prep, it's...not all that bad. Would still like to write an additional test for the second-to-last commit, though the changes in that commit do show it has some coverage.

@TheBlueMatt added this to the 0.1 milestone Nov 18, 2024

codecov bot commented Nov 18, 2024

Codecov Report

Attention: Patch coverage is 88.25911% with 29 lines in your changes missing coverage. Please review.

Project coverage is 89.73%. Comparing base (1a8bf62) to head (5d5971c).

Files with missing lines                         Patch %   Lines
lightning/src/ln/channelmanager.rs               80.57%    17 Missing and 10 partials ⚠️
lightning/src/ln/chanmon_update_fail_tests.rs    98.13%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3414   +/-   ##
=======================================
  Coverage   89.73%   89.73%           
=======================================
  Files         130      130           
  Lines      107793   107860   +67     
  Branches   107793   107860   +67     
=======================================
+ Hits        96727    96793   +66     
+ Misses       8662     8656    -6     
- Partials     2404     2411    +7     


@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch from 8fca379 to fb56bd3 on November 24, 2024 15:05
@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch 2 times, most recently from d581b8d to 76b3a99 on December 6, 2024 19:18

@TheBlueMatt (Collaborator, author) commented Dec 6, 2024

Rebased on #3413 after merge. I promise this one isn't quite as bad @jkczyz / @valentinewallace.

@jkczyz (Contributor) left a comment:


Just a few things that jumped out at me. Will need to take another pass at this.

lightning/src/ln/channelmanager.rs: resolved review thread (outdated)
lightning/src/ln/channelmanager.rs: resolved review thread (outdated)
Comment on lines -13235 to +13190:

-	let latest_update_id = monitor.get_latest_update_id();
-	update.update_id = latest_update_id.saturating_add(1);
+	let latest_update_id = monitor.get_latest_update_id().saturating_add(1);
+	update.update_id = latest_update_id;

Contributor:

Was this behavior wrong previously? (i.e., using the previous latest_update_id below)

Collaborator (author):

The semantics have changed - because we're setting the value at update creation, we need the map to contain the value after this update, rather than the value before it (which was previously used to set the update's value when it was applied).
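
For illustration, a minimal sketch of the new ordering - a plain HashMap stands in for the real PeerState tracking and the types are simplified, so this is not the actual LDK code:

use std::collections::HashMap;

struct MonitorUpdate { update_id: u64 }

// Stand-in for PeerState::closed_channel_monitor_update_ids.
fn assign_update_id_at_creation(
	map: &mut HashMap<[u8; 32], u64>, channel_id: [u8; 32],
	update: &mut MonitorUpdate,
) {
	// The map stores the id of the last update *generated*, so this
	// update's id is one past whatever is recorded.
	let next_id = map.get(&channel_id).copied().unwrap_or(0).saturating_add(1);
	update.update_id = next_id;
	// Record the post-update value: another update generated before this
	// one is applied must still receive a strictly larger id.
	map.insert(channel_id, next_id);
}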

lightning/src/ln/channelmanager.rs: resolved review thread (outdated)

@valentinewallace (Contributor) left a comment:


Will take another pass as well but overall looks good

lightning/src/ln/channelmanager.rs: resolved review thread (outdated)
// If the A<->B channel was closed before we reload, we'll replay the claim against it on
// reload, causing the `PaymentForwarded` event to get replayed.
let evs = nodes[1].node.get_and_clear_pending_events();
assert_eq!(evs.len(), if close_chans_before_reload { 2 } else { 1 });

Contributor:

For my understanding, does this PR make the duplicate-PaymentForwarded issue worse or is it the same as before?

Collaborator (author):

Uhhh, if you're using async monitor updating, maybe? It shouldn't change anything for non-async monitor update users, but of course for async monitor update users it should only change things (including a new spurious event) in cases where what we did before was just unsafe.

@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch from 76b3a99 to 60653bb on December 11, 2024 16:40
lightning/src/ln/channelmanager.rs: resolved review thread
}
if let Some((_, funding_txo, _, update)) = $shutdown_res_mut.monitor_update.take() {
handle_new_monitor_update!($self, funding_txo, update, $peer_state,
$channel_context, REMAIN_LOCKED_UPDATE_ACTIONS_PROCESSED_LATER);
}
// If there's a possibility that we need to generate further monitor updates for this
// channel, we need to store the last update_id of it. However, we don't want to insert

Contributor:

~10 lines down, it looks like we're setting the update id in closed_channel_monitor_update_ids to the update_id prior to the below background event update_id, but it seems like after 6489550 the update_id in said map should no longer ignore pending background events?

Collaborator (author):

Grrr, good catch.

@@ -7237,37 +7237,42 @@ where

 let mut preimage_update = ChannelMonitorUpdate {
 	update_id: 0, // set in set_closed_chan_next_monitor_update_id
-	counterparty_node_id: prev_hop.counterparty_node_id,
+	counterparty_node_id: None,

Contributor:

What's the reason for this change?

Collaborator (author):

Mostly spurious - I think when moving this code around for the 10th time at some point I didn't have the counterparty id readily available, so I just removed it since it doesn't matter much (it's just used to keep the ChannelMonitor's counterparty_node_id field updated if the ChannelMonitor was built before that field was added). I went ahead and re-added it, though.

lightning/src/ln/chanmon_update_fail_tests.rs: resolved review thread
Comment on lines +13646 to +13643
let in_flight_updates = per_peer_state.in_flight_monitor_updates
.entry(*funding_txo)
.or_insert_with(Vec::new);
debug_assert!(!in_flight_updates.iter().any(|upd| upd == update));
in_flight_updates.push(update.clone());

Contributor:

Missing test coverage for this

Collaborator (author):

Not sure what is missing test coverage here? If I add a panic here a ton of tests fail.

Contributor:

Oh, I had commented the lines out. Probably all tests passed because updates are duplicated in pending background events?

Collaborator (author):

Right, it's duplicated there and handle_new_monitor_update ultimately happily treats it as an actually-new update and inserts it into the queue. I spent a bit trying to figure out how to test this, but I don't actually think it's externally visible.

Comment on lines +13646 to +13643
let in_flight_updates = per_peer_state.in_flight_monitor_updates
.entry(*funding_txo)
.or_insert_with(Vec::new);
debug_assert!(!in_flight_updates.iter().any(|upd| upd == update));
in_flight_updates.push(update.clone());

Contributor:

Did you mean to make this part of the DRYing commit?

@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch from 60653bb to c199059 on December 11, 2024 21:14

@valentinewallace (Contributor) left a comment:


Basically LGTM, I'm good with a squash

lightning/src/ln/channelmanager.rs: resolved review thread
lightning/src/ln/channelmanager.rs: resolved review thread (outdated)
lightning/src/ln/channelmanager.rs: resolved review thread
@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch from c199059 to cd65840 on December 12, 2024 19:57

@TheBlueMatt (Collaborator, author) commented:
Squashed:

$ git diff-tree -U1 c19905929 cd658400c
diff --git a/lightning/src/ln/channelmanager.rs b/lightning/src/ln/channelmanager.rs
index cfd5de805..58a04a3e5 100644
--- a/lightning/src/ln/channelmanager.rs
+++ b/lightning/src/ln/channelmanager.rs
@@ -13620,3 +13620,3 @@ where
 				}
-				let mut per_peer_state =  per_peer_state.get(counterparty_node_id)
+				let mut per_peer_state = per_peer_state.get(counterparty_node_id)
 					.expect("If we have pending updates for a channel it must have an entry")

lightning/src/ln/channelmanager.rs: resolved review thread (outdated)
Comment on lines 7232 to 7237
// Note that the below is race-y - we set the `update_id` here and then drop the peer_state
// lock before applying the update in `apply_post_close_monitor_update` (or via the
// background events pipeline). During that time, some other update could be created and
then applied, resulting in `ChannelMonitorUpdate`s being applied out of order and causing
// a panic.
Self::set_closed_chan_next_monitor_update_id(&mut *peer_state, prev_hop.channel_id, &mut preimage_update);

Contributor:

Looks like this was supposed to be applied to 94d0735?

Contributor:

Looks like this fixup (now 7ce4e99) is still applied to the wrong commit?

Collaborator (author):

Oops, not sure how I did that. The commit applies cleanly one up.

lightning/src/ln/channelmanager.rs: resolved review thread (outdated)
Now that we track the latest `ChannelMonitorUpdate::update_id` for
each closed channel in
`PeerState::closed_channel_monitor_update_ids`, we should always
have a `PeerState` entry for the channel counterparty any time we
go to claim an HTLC on a channel, even if it's closed.

Here we make this a hard assertion as we'll need to access that
`PeerState` in the coming commits to track in-flight updates
against closed channels.
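
In code terms this is roughly the following shape (a sketch with simplified types, not the exact diff; the expect message matches the one visible in the squashed diff earlier in the thread):

use std::collections::HashMap;
use std::sync::{Mutex, MutexGuard};

struct PeerState { /* includes closed_channel_monitor_update_ids */ }

// Sketch: with every closed channel's counterparty tracked per-peer, a
// missing entry during claim is a bug rather than a soft error, so the
// lookup becomes a hard assertion.
fn peer_state_for_claim<'a>(
	per_peer_state: &'a HashMap<[u8; 33], Mutex<PeerState>>,
	counterparty_node_id: &[u8; 33],
) -> MutexGuard<'a, PeerState> {
	per_peer_state.get(counterparty_node_id)
		.expect("If we have pending updates for a channel it must have an entry")
		.lock().unwrap()
}
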
In c99d3d7 we added a new
`apply_post_close_monitor_update` method which takes a
`ChannelMonitorUpdate` (possibly) for a channel which has been
closed, sets the `update_id` to the right value to keep our updates
well-ordered, and then applies it.

Setting the `update_id` at application time here is fine - updates
don't really have an order after the channel has been closed, they
can be applied in any order - and was done for practical reasons
as calculating the right `update_id` at generation time takes a
bit more work on startup, and was impossible without new
assumptions during claim.

In the previous commit we added exactly the new assumption we need
at claiming (as it's required for the next few commits anyway), so
now the only thing stopping us is the extra complexity.

In the coming commits, we'll move to tracking post-close
`ChannelMonitorUpdate`s as in-flight like any other updates, which
requires having an `update_id` at generation-time so that we know
what updates are still in-flight.

Thus, we go ahead and eat the complexity here, creating
`update_id`s when the `ChannelMonitorUpdate`s are generated for
closed-channel updates, like we do for channels which are still
live.

We also ensure that we always insert `ChannelMonitorUpdate`s in the
pending updates set when we push the background event, avoiding a
race where we push an update as a background event, then while it's
processing another update finishes and the post-update actions get
run.
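
A minimal sketch of that last point, with simplified stand-in types (the or_insert_with/debug_assert shape mirrors the lines quoted in the review threads above):

use std::collections::HashMap;

#[derive(Clone, PartialEq)]
struct Update { update_id: u64 }
struct BackgroundEvent { update: Update }

// Record the update as in-flight *before* queueing the background event, so
// that if another update completes while this event is still pending, the
// post-update actions see this one as outstanding and don't run early.
fn queue_post_close_update(
	in_flight: &mut HashMap<[u8; 36], Vec<Update>>,
	background_events: &mut Vec<BackgroundEvent>,
	funding_txo: [u8; 36], update: Update,
) {
	let updates = in_flight.entry(funding_txo).or_insert_with(Vec::new);
	debug_assert!(!updates.iter().any(|upd| *upd == update));
	updates.push(update.clone());
	background_events.push(BackgroundEvent { update });
}
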
@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch from cd65840 to 8be0b8e on December 13, 2024 16:52
In d1c340a we added support in
`handle_new_monitor_update!` for handling updates without dropping
locks.

In the coming commits we'll start handling `ChannelMonitorUpdate`s
"like normal" for updates against closed channels. Here we set up
the first step by adding a new `POST_CHANNEL_CLOSE` variant on
`handle_new_monitor_update!` which attempts to handle the
`ChannelMonitorUpdate` and handles completion actions if it
finishes immediately, just like the pre-close variant.
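
Conceptually the new variant behaves as below (a sketch with hypothetical names; the status enum stands in for LDK's ChannelMonitorUpdateStatus):

// Stand-in for ChannelMonitorUpdateStatus.
enum UpdateStatus { Completed, InProgress }

// Sketch of the POST_CHANNEL_CLOSE flavor: apply the update and, if it
// persisted synchronously, run the monitor-update completion actions
// immediately - mirroring the pre-close variant, but with no Channel to
// resume.
fn handle_post_close_update(
	apply_update: impl FnOnce() -> UpdateStatus,
	run_completion_actions: impl FnOnce(),
) {
	match apply_update() {
		UpdateStatus::Completed => run_completion_actions(),
		// Async persist still pending: the completion actions run later,
		// once the user signals that persistence finished.
		UpdateStatus::InProgress => {},
	}
}
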
One of the largest gaps in our async persistence functionality has
been preimage (claim) updates to closed channels. Here we finally
implement support for this (for updates at runtime).

Thanks to all the work we've built up over the past many commits,
this is a well-contained patch within `claim_mpp_part`, pushing
the generated `ChannelMonitorUpdate`s through the same pipeline we
use for open channels.

Sadly we can't use the `handle_new_monitor_update` macro wholesale
as it handles the `Channel` resumption as well which we don't do
here.
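
Put together, the runtime claim path inside claim_mpp_part is roughly the following (sketch only; simplified types and hypothetical helper names):

use std::collections::HashMap;

#[derive(Clone, PartialEq)]
struct Update { update_id: u64 }
enum UpdateStatus { Completed, InProgress }

// Sketch: track the preimage update as in-flight like any open-channel
// update, apply it, and only run completion actions if persistence
// finished synchronously.
fn claim_against_closed_channel(
	in_flight: &mut HashMap<[u8; 36], Vec<Update>>, funding_txo: [u8; 36],
	preimage_update: Update,
	apply: impl FnOnce(&Update) -> UpdateStatus,
	run_completion_actions: impl FnOnce(),
) {
	in_flight.entry(funding_txo).or_insert_with(Vec::new)
		.push(preimage_update.clone());
	if let UpdateStatus::Completed = apply(&preimage_update) {
		// Persisted synchronously: clear it from in-flight and run the
		// completion actions (what handle_new_monitor_update would do,
		// minus the Channel resumption closed channels don't need).
		in_flight.get_mut(&funding_txo).unwrap()
			.retain(|upd| *upd != preimage_update);
		run_completion_actions();
	}
}
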
On startup, we walk the preimages and payment HTLC sets on all our
`ChannelMonitor`s, re-claiming all payments which we recently
claimed. This ensures all HTLCs in any claimed payments are claimed
across all channels.

In doing so, we expect to see the same payment multiple times,
after all it may have been received as multiple HTLCs across
multiple channels. In such cases, there's no reason to redundantly
claim the same set of HTLCs again and again. In the current code,
doing so may lead to redundant `PaymentClaimed` events, and in a
coming commit will instead cause an assertion failure.
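
A sketch of the dedup, assuming a simplified startup walk (names hypothetical):

use std::collections::HashSet;

// The same payment hash can appear on several ChannelMonitors - one HTLC
// per channel of an MPP payment - so each claim is only replayed once.
fn replay_startup_claims(
	monitor_payment_hashes: impl Iterator<Item = [u8; 32]>,
	mut replay_claim: impl FnMut([u8; 32]),
) {
	let mut seen = HashSet::new();
	for payment_hash in monitor_payment_hashes {
		// insert() returns false for hashes already handled, avoiding
		// redundant PaymentClaimed events for the same payment.
		if seen.insert(payment_hash) {
			replay_claim(payment_hash);
		}
	}
}
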
One of the largest gaps in our async persistence functionality has
been preimage (claim) updates to closed channels. Here we finally
implement support for this (for updates which are generated during
startup).

Thanks to all the work we've built up over the past many commits,
this is a fairly straightforward patch, removing the
immediate-completion logic from `claim_mpp_part` and adding the
required in-flight tracking logic to
`apply_post_close_monitor_update`.

Like in the during-runtime case in the previous commit, we sadly
can't use the `handle_new_monitor_update` macro wholesale as it
handles the `Channel` resumption as well which we don't do here.

This moves the common `if during_startup { push background event }
else { apply ChannelMonitorUpdate }` pattern by simply inlining it
in `handle_new_monitor_update`.
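
Schematically, the pattern being inlined (sketch with simplified types):

struct Update { update_id: u64 }

// During startup the chain::Watch can't be called yet, so the update is
// queued as a background event and replayed later; at runtime it is
// applied directly.
fn handle_new_monitor_update_sketch(
	during_startup: bool,
	background_events: &mut Vec<Update>,
	apply_update: impl FnOnce(Update),
	update: Update,
) {
	if during_startup {
		background_events.push(update);
	} else {
		apply_update(update);
	}
}
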
@TheBlueMatt force-pushed the 2024-09-async-persist-claiming-from-closed-chan branch from 8be0b8e to 5d5971c on December 13, 2024 17:30

@jkczyz (Contributor) left a comment:


LGTM. Feel free to squash, IMO.
